Temporal Early Exits for Efficient Video Object Detection
Authors
Abstract
Related resources
Spatial-Temporal Memory Networks for Video Object Detection
We introduce Spatial-Temporal Memory Networks (STMN) for video object detection. At its core, we propose a novel Spatial-Temporal Memory module (STMM) as the recurrent computation unit to model long-term temporal appearance and motion dynamics. The STMM’s design enables the integration of ImageNet pre-trained backbone CNN weights for both the feature stack as well as the prediction head, which ...
Deep Spatial-Temporal Joint Feature Representation for Video Object Detection
With the development of deep neural networks, many object detection frameworks have shown great success in the fields of smart surveillance, self-driving cars, and facial recognition. However, the data sources are usually videos, and the object detection frameworks are mostly established on still images and only use the spatial information, which means that the feature consistency cannot be ens...
Spatio-temporal Features for Efficient Video Copy Detection
Content-Based Video Copy Detection (CBVCD) aims at detecting whether or not a query video is a copy or part of a reference video from a database. In this paper, we present a CBVCD system based on spatiotemporal features that can competitively deal with large databases in terms of both performance and efficiency. Instead of selecting keyframes or uniformly sampling from original videos and then extra...
Efficient Online Spatio-Temporal Filtering for Video Event Detection
We propose a novel spatio-temporal filtering technique to improve the per-pixel prediction map by leveraging the spatio-temporal smoothness of the video signal. Unlike previous techniques that perform spatio-temporal filtering in an offline/batch mode, e.g., through a graphical model, our filtering can be implemented online and in real time, with provably lowest computational complexity....
Efficient Co-Salient Video Object Detection Based on Preattentive Processing
Automatic video annotation is a critical step for content-based video retrieval and browsing. Automatically detecting the focus of interest, such as co-occurring objects in video frames, can ease the tedious manual labeling process. However, detecting co-occurring objects that are visually salient in video sequences is a challenging task. In this paper, in order to detect co-salient video ob...
Journal
Journal title: Social Science Research Network
Year: 2022
ISSN: 1556-5068
DOI: https://doi.org/10.2139/ssrn.4001359